
39 research outputs found

k-Anonymity in the Presence of External Databases

Author: MOURATIDIS Kyriakos
Papadias Dimitris
SACHARIDIS Dimitris
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

The concept of k-anonymity has received considerable attention due to the need of several organizations to release microdata without revealing the identity of individuals. Although all previous k-anonymity techniques assume the existence of a public database (PD) that can be used to breach privacy, none utilizes PD during the anonymization process. Specifically, existing generalization algorithms create anonymous tables using only the microdata table (MT) to be published, independently of the external knowledge available. This omission leads to high information loss. Motivated by this observation, we first introduce the concept of k-join-anonymity (KJA), which permits more effective generalization to reduce the information loss. Briefly, KJA anonymizes a superset of MT, which includes selected records from PD. We propose two methodologies for adapting k-anonymity algorithms to their KJA counterparts. The first generalizes the combination of MT and PD, under the constraint that each group should contain at least one tuple of MT (otherwise, the group is useless and discarded). The second anonymizes MT, and then refines the resulting groups using PD. Finally, we evaluate the effectiveness of our contributions with an extensive experimental evaluation using real and synthetic datasets.
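As a rough illustration of the baseline notion this abstract builds on, the sketch below checks plain k-anonymity (every combination of quasi-identifier values appears in at least k rows) and applies a simple range generalization. It does not implement k-join-anonymity or either methodology from the paper; the function names, column names, and toy records are all assumptions made up for this example.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(row, width=10):
    """Replace an exact age with a range label (a hypothetical generalization step)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


# Toy microdata table; column names and values are illustrative only.
microdata = [
    {"age": 23, "zip": "15772", "disease": "flu"},
    {"age": 27, "zip": "15772", "disease": "asthma"},
    {"age": 31, "zip": "15773", "disease": "flu"},
    {"age": 38, "zip": "15773", "disease": "diabetes"},
]

# Exact ages make every row unique on (age, zip), so 2-anonymity fails.
raw_ok = is_k_anonymous(microdata, ["age", "zip"], k=2)

# Coarsening ages into decades forms two groups of two rows each.
generalized = [generalize_age(r) for r in microdata]
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

The coarser the ranges needed to reach group size k, the more information is lost, which is the cost the abstract says KJA aims to reduce.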

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

DSpace at NTUA

Hong Kong University of Science and Technology Institutional Repository

TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching

Author: Koubarakis Manolis
Papapetrou Odysseas
Sacharidis Dimitris
Skoutas Dimitrios
Zeakis Alexandros
Publication date: 01/12/2022

Set similarity join is an important problem with many applications in data discovery, cleaning and integration. To increase robustness, fuzzy set similarity join calculates the similarity of two sets based on maximum weighted bipartite matching instead of set overlap. This allows pairs of elements, represented as sets or strings, to also match approximately rather than exactly, e.g., based on Jaccard similarity or edit distance. However, this significantly increases the verification cost, making efficient and effective filtering techniques that reduce the number of candidate pairs even more important. The current state-of-the-art algorithm relies on similarity computations between pairs of elements to filter candidates. In this paper, we propose token-based instead of element-based filtering, showing that it is significantly more lightweight, while offering similar or even better pruning effectiveness. Moreover, we address the top-k variant of the problem, alleviating the need for a user-specified similarity threshold. We also propose early termination to reduce the cost of verification. Our experimental results on six real-world datasets show that our approach always outperforms the state of the art, being an order of magnitude faster on average.
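To make the verification step described in this abstract concrete, the sketch below scores two sets of elements by maximum weighted bipartite matching, with Jaccard similarity between token sets as the edge weight. It is not the TokenJoin filtering algorithm. The normalization M / (|R| + |S| - M) is an assumed Jaccard-style formula, the function names are hypothetical, and the matching is found by brute force, which is only practical for very small sets.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard similarity between two elements, each given as a collection of tokens."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def max_matching_score(R, S):
    """Best total edge weight over all one-to-one matchings (brute force)."""
    small, large = (R, S) if len(R) <= len(S) else (S, R)
    best = 0.0
    # Try every way of assigning each element of the smaller set
    # to a distinct element of the larger set.
    for perm in permutations(large, len(small)):
        score = sum(jaccard(x, y) for x, y in zip(small, perm))
        best = max(best, score)
    return best


def fuzzy_set_similarity(R, S):
    """Assumed Jaccard-style normalization of the matching score."""
    if not R and not S:
        return 1.0
    m = max_matching_score(R, S)
    return m / (len(R) + len(S) - m)


# Two sets whose elements are token sets; one pair matches only approximately.
R = [{"data", "cleaning"}, {"set", "join"}]
S = [{"data", "cleansing"}, {"set", "join"}]
similarity = fuzzy_set_similarity(R, S)
```

Under exact set overlap only one of the two elements would count as shared, whereas the matching also credits the partial match between the first pair. For larger sets, a polynomial-time assignment solver would replace the permutation loop; the cost of that verification is what makes candidate filtering worthwhile.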

Pure OAI Repository

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Fairness Aware Counterfactuals for Subgroups

Author: Emiris Ioannis
Fotakis Dimitris
Giannopoulos Giorgos
Kavouras Loukas
Psaroudaki Eleni
Rontogiannis Dimitrios
Sacharidis Dimitris
Theologitis Nikolaos
Tsopelas Konstantinos
Publication date: 26/06/2023

In this work, we present Fairness Aware Counterfactuals for Subgroups (FACTS), a framework for auditing subgroup fairness through counterfactual explanations. We start with revisiting (and generalizing) existing notions and introducing new, more refined notions of subgroup fairness. We aim to (a) formulate different aspects of the difficulty of individuals in certain subgroups to achieve recourse, i.e., receive the desired outcome, either at the micro level, considering members of the subgroup individually, or at the macro level, considering the subgroup as a whole, and (b) introduce notions of subgroup fairness that are robust, if not totally oblivious, to the cost of achieving recourse. We accompany these notions with an efficient, model-agnostic, highly parameterizable, and explainable framework for evaluating subgroup fairness. We demonstrate the advantages, the wide applicability, and the efficiency of our approach through a thorough experimental evaluation of different benchmark datasets.

arXiv.org e-Print Archive

Interactivity, Fairness and Explanations in Recommendations

Author: Giannopoulos Giorgos
Papastefanatos George
Sacharidis Dimitris
Stefanidis Kostas
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2021

More and more aspects of our everyday lives are influenced by automated decisions made by systems that statistically analyze traces of our activities. It is thus natural to question whether such systems are trustworthy, particularly given the opaqueness and complexity of their internal workings. In this paper, we present our ongoing work towards a framework that aims to increase trust in machine-generated recommendations by combining ideas from three separate recent research directions, namely explainability, fairness and user interactive visualization. The goal is to enable different stakeholders, with potentially varying levels of background and diverse needs, to query, understand, and fix sources of distrust.

Trepo - Institutional Repository of Tampere University


New Trends in Scientific Knowledge Graphs and Research Impact Assessment

Author: Manghi Paolo
Mannocci Andrea
Osborne Francesco
Sacharidis Dimitris
Salatino Angelo
Vergoulis Thanasis
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2021

Open Research Online (The Open University)

On-Line Discovery of Hot Motion Paths

Author: Kantere Verena
MOURATIDIS Kyriakos
Patroumpas Kostas
Potamias Michalis
SACHARIDIS Dimitris
Sellis Timos
Terrovitis Manolis
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

We consider an environment of numerous moving objects, equipped with location-sensing devices and capable of communicating with a central coordinator. In this setting, we investigate the problem of maintaining hot motion paths, i.e., routes frequently followed by multiple objects over the recent past. Motion paths approximate portions of objects' movement within a tolerance margin that depends on the uncertainty inherent in positional measurements. Discovery of hot motion paths is important to applications requiring classification/profiling based on monitored movement patterns, such as targeted advertising, resource allocation, etc. To achieve this goal, we delegate part of the path extraction process to objects, by assigning to them adaptive lightweight filters that dynamically suppress unnecessary location updates and, thus, help reduce the communication overhead. We demonstrate the benefits of our methods and their efficiency through extensive experiments on synthetic data sets.

Crossref

Institutional Knowledge at Singapore Management University

DSpace at NTUA

Swinburne Research Bank

Partially Materialized Digest Scheme: An Efficient Verification Method for Outsourced Databases

Author: C.U. Martel
D. Comer
Dimitris Sacharidis
E. Bertino
F. Korn
G.R. Hjaltason
HweeHwa Pang
Kyriakos Mouratidis
M. Berg de
P.T. Devanbu
R.L. Rivest
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Crossref

Institutional Knowledge at Singapore Management University

DSpace at NTUA

Databases and Information Systems in the AI Era: Contributions from ADBIS, TPDL and EDA 2020 Workshops and Doctoral Consortium

Author: Bellatreche Ladjel
Bentayeb Fadila
Bieliková Mária
Boussaid Omar
Catania Barbara
Ceravolo Paolo
Demidova Elena
Gomez Lopez Maria Teresa
Halfeld Ferrari Mirian
Kordić Slavica
Luković Ivan
Manghi Paolo
Mannocci Andrea
Osborne Francesco
Papatheodorou Christos
Ristić Sonja
Romero Oscar
S. Hara Carmem
Sacharidis Dimitris
Salatino Angelo
Talens Guilaine
van Keulen Maurice
Vergoulis Thanasis
Zumer Maja
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Research on database and information technologies has been rapidly evolving over the last couple of years. This evolution was led by three major forces: Big Data, AI, and the Connected World, which open the door to innovative research directions and challenges while exploiting four main areas: (i) computational and storage resource modeling and organization; (ii) new programming models; (iii) processing power; and (iv) new applications that emerge related to health, environment, education, cultural heritage, banking, etc. The 24th East-European Conference on Advances in Databases and Information Systems (ADBIS 2020), the 24th International Conference on Theory and Practice of Digital Libraries (TPDL 2020) and the 16th Workshop on Business Intelligence and Big Data (EDA 2020), held during August 25–27, 2020, at Lyon, France, and associated satellite events aimed at covering some emerging issues related to database and information system research in these areas. The aim of this paper is to present such events, their motivations, and topics of interest, as well as briefly outline the papers selected for presentations. The selected papers will then be included in the remainder of this volume.

Crossref

AIR Universita degli studi di Milano

Open Research Online (The Open University)

Archivio istituzionale della ricerca - Università di Genova

University of Twente Research Information

idUS. Depósito de Investigación Universidad de Sevilla

Towards Mobility Data Science (Vision Paper)

Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years.

arXiv.org e-Print Archive